Fix Energon data loading incompatibility with updated Qwen3-VL finetuning pipeline by aub123 · Pull Request #2680 · NVIDIA-NeMo/Megatron-Bridge

aub123 · 2026-03-06T09:46:42Z

Background

The data loading logic for Energon-format datasets is not compatible with the updated Qwen3-VL finetuning pipeline.

Recent updates changed the expected multimodal sample structure, which causes mismatches when loading Energon datasets.

Changes

- Adjust Energon data loading logic in `qwen3_vl_bridge.py`
- Align multimodal sample parsing with the updated Qwen3-VL finetuning interface
- Ensure Energon datasets can be used directly in the current training pipeline

Notes

This change focuses on compatibility with the new Qwen3-VL finetuning logic and does not modify the Energon dataset format itself.

Tested on Qwen3-VL-8B-Instruct model.

Summary by CodeRabbit

Documentation
- Enhanced README with improved formatting, structure guidance, and inference instructions for Qwen VL model examples
Bug Fixes
- Corrected image file extension mapping in Energon dataset configuration
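
To illustrate why the extension name in the field map matters, here is a minimal sketch. The helper `apply_field_map` and the sample dictionary are hypothetical and simplified; they are not Energon's implementation, and the real sample layout is not shown in this PR. The sketch only assumes that each field is looked up under the extension key it is mapped to.

```python
# Hypothetical, simplified stand-in for how a field map selects entries
# from a decoded WebDataset-style sample. Not Energon's actual code.


def apply_field_map(raw_sample: dict, field_map: dict) -> dict:
    """Map field names to the values stored under their extension keys."""
    missing = [ext for ext in field_map.values() if ext not in raw_sample]
    if missing:
        raise KeyError(f"extensions not found in sample: {missing}")
    return {field: raw_sample[ext] for field, ext in field_map.items()}


# A made-up sample whose image list was written under the "jpgs" extension.
raw_sample = {"__key__": "000001", "jpgs": [b"<image bytes>"], "json": b"{}"}

# With the corrected mapping the "imgs" field resolves.
fixed = apply_field_map(raw_sample, {"imgs": "jpgs"})
print(sorted(fixed))

# With the old mapping the lookup fails because no "jpg" entry exists.
try:
    apply_field_map(raw_sample, {"imgs": "jpg"})
except KeyError as err:
    print("old mapping fails:", err)
```

Under these assumptions, a mapping of `imgs: jpg` cannot find data that was stored as `jpgs`, which is consistent with the mismatch this PR describes.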

copy-pr-bot · 2026-03-06T09:46:45Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

…ng pipeline Signed-off-by: aub123 <2546319206@qq.com>

coderabbitai · 2026-03-06T10:03:23Z

Walkthrough

Updates image and video handling in a Qwen VL model example by introducing a new tensor conversion method in the task encoder and correcting the file extension mapping in the dataset documentation from jpg to jpgs.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation: `examples/models/vlm/qwen3_vl/README.md` | Added formatting blocks (directory structure, import/export steps, notes) and changed field_map entry from `imgs: jpg` to `imgs: jpgs` for correct image extension mapping. |
| Task Encoder Implementation: `src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py` | Introduced private method `_convert_to_tensor` in `videohandler` class to handle numpy array conversion to CHW float32 tensors and raw byte fallback; refactored image and video decoding logic to use the new conversion method. |
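
The layout and dtype conversion described for `_convert_to_tensor` can be sketched roughly as follows. This is not the PR's code: the function name `hwc_uint8_to_chw_float32` is hypothetical, it uses NumPy only, and the scaling to [0, 1] is an assumption taken from the docstring suggested later in the review. The actual method returns a torch tensor (for example via `torch.from_numpy`) and also falls back to the image handler for raw bytes, which this sketch omits.

```python
import numpy as np


def hwc_uint8_to_chw_float32(image: np.ndarray) -> np.ndarray:
    """Reorder an HWC uint8 image to CHW and scale it to float32 in [0, 1]."""
    if image.ndim != 3:
        raise ValueError(f"expected an HWC array, got shape {image.shape}")
    chw = np.transpose(image, (2, 0, 1))  # HWC -> CHW
    # Copy into a contiguous float32 buffer, then scale (assumed range [0, 1]).
    return np.ascontiguousarray(chw, dtype=np.float32) / 255.0


# A made-up 4x6 RGB image with a fully saturated red channel.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[..., 0] = 255

out = hwc_uint8_to_chw_float32(image)
print(out.shape, out.dtype)
```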

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Results For Major Changes | ⚠️ Warning | PR contains significant changes to Energon data loading pipeline and tensor conversion logic, but provides no documented test results, metrics, or verification details despite claiming testing was performed. | Add detailed test results to PR description including loss curves, convergence metrics, specific test cases, and performance comparisons on Qwen3-VL-8B-Instruct model to demonstrate compatibility without regression. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Fix Energon data loading incompatibility with updated Qwen3-VL finetuning pipeline' directly and accurately summarizes the main change: updating Energon dataset loading to work with the updated Qwen3-VL pipeline, which aligns with the documented PR objectives. |


coderabbitai

🧹 Nitpick comments (2)

src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py (2)

216: `extensions_mapping` is now unused.

The `self.extensions_mapping` dictionary is no longer used after this refactor. Previously it was used to map extensions to keys for `imagehandler`, but now `_convert_to_tensor` always passes `"jpg"` directly. Consider removing the unused attribute to avoid confusion.

🧹 Proposed cleanup

     def __init__(self, imagespec):
         self.extensions = ["jpgs", "mp4s", "videos"]
-        self.extensions_mapping = {"jpgs": "jpg", "mp4s": "jpg", "videos": "jpg"}
         self.image_handler = imagehandler(imagespec)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py` at line
216, Remove the unused attribute self.extensions_mapping from the videohandler
class: locate where it is defined (self.extensions_mapping = {"jpgs": "jpg",
"mp4s": "jpg", "videos": "jpg"}) and delete that line since _convert_to_tensor
now always passes "jpg" directly and imagehandler no longer relies on it; ensure
no other code references extensions_mapping (search for extensions_mapping)
before committing.
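
One way to run the suggested search for remaining references is a small standard-library script. This is a sketch, not part of the PR: `find_attribute_uses` is a hypothetical helper, and the snippet it scans is made up from the diff above, not the real file contents.

```python
import ast


def find_attribute_uses(source: str, attr: str) -> list[int]:
    """Return the line numbers where `<something>.<attr>` appears in source."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Attribute) and node.attr == attr
    )


# Made-up snippet modelled on the diff above; not the real file contents.
snippet = '''
class videohandler:
    def __init__(self, imagespec):
        self.extensions = ["jpgs", "mp4s", "videos"]
        self.extensions_mapping = {"jpgs": "jpg", "mp4s": "jpg", "videos": "jpg"}
'''

print(find_attribute_uses(snippet, "extensions_mapping"))
```

The assignment itself counts as a use, so a result containing only the defining line suggests nothing else in that file reads the attribute. A plain text search across the repository is still worth doing, since this sketch inspects one source string at a time.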

219-231: Add type hints for the new method.

Per coding guidelines, functions should have type hints for arguments and return types. This improves readability and enables static type checking.

💡 Suggested type hints

-    def _convert_to_tensor(self, data):
-        """Convert numpy array or bytes to tensor.
-        
-        The wds conversion script stores images as numpy arrays (HWC, uint8),
-        so we need to handle both numpy arrays and raw bytes.
-        """
+    def _convert_to_tensor(self, data: np.ndarray | bytes) -> torch.Tensor:
+        """Convert numpy array or bytes to tensor.
+
+        The wds conversion script stores images as numpy arrays (HWC, uint8),
+        so we need to handle both numpy arrays and raw bytes.
+
+        Args:
+            data: Image data as numpy array (HWC, uint8) or raw bytes.
+
+        Returns:
+            Tensor in CHW format with float32 values in [0, 1].
+        """
         if isinstance(data, np.ndarray):

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py` around
lines 219 - 231, Add type hints to _convert_to_tensor: annotate the data
parameter as Union[np.ndarray, bytes] and the return type as torch.Tensor (i.e.,
def _convert_to_tensor(self, data: Union[np.ndarray, bytes]) -> torch.Tensor).
Also ensure typing Union is imported (from typing import Union) and that
image_handler's return type is compatible with torch.Tensor; update
image_handler signature or cast its result to torch.Tensor if needed. This
change applies to the _convert_to_tensor method in task_encoder.py and any
related image_handler definition used by this method.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 598a6015-fbb7-4a34-ac42-89d0aba24cdd

📥 Commits

Reviewing files that changed from the base of the PR and between c15303e and 9603399.

📒 Files selected for processing (2)

examples/models/vlm/qwen3_vl/README.md
src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py

github-actions bot added the community-request label Mar 6, 2026


Fix Energon loading logic incompatible with updated Qwen3-VL finetuni…

9603399

…ng pipeline Signed-off-by: aub123 <2546319206@qq.com>

aub123 force-pushed the fix/energon-qwen3vl-loader branch from 2530b0e to 9603399 Compare March 6, 2026 09:58

coderabbitai bot reviewed Mar 6, 2026

aub123 wants to merge 1 commit into NVIDIA-NeMo:main from aub123:fix/energon-qwen3vl-loader
